#pyGnGAnalysis

Simple data analysis within a nice Python environment

#Example Call

Currently the program is launched interactively from the Python console following this scheme:

```
import TrialAnalysis

TrialAnalysis.TrialBatchAnalysis('/somewhere/masterDirectory', yamlFile='wherever/TrialAnalysis.yaml', exportFullTableAsCSV=True)
```


#Execution Details
For each session in the masterDirectory, a TrialAnalysis is run, consisting of the following steps:

* **First, a `Session Table` is generated containing all matlab+ethovision data for a single experiment session**
    * This is done by first loading the matlab and ethovision files independently. For the matlab data, a column `Duration` is added that specifies the duration between the current and the next loop.
    * Based on the field MouseID, a matlab and an ethovision file are joined into a single `EssayTable`. So for each mouse used in a session there will be an `EssayTable` that contains at least the matlab data and, if present, the ethovision data.
    * The matlab and ethovision data are joined so that the loop timing of the matlab data is preserved: for each 'matlab loop' the closest ethovision rows are aggregated. The aggregation method for each ethovision column is deduced from its content (see autoDetectResampleMethods).
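
The joining step above can be sketched with pandas roughly as follows. This is only an illustration under assumptions, not the package's actual code: the column names `Time` and `Velocity` and all values are hypothetical (only `Duration` is named in this README), and `pandas.merge_asof` is used here as one possible way to assign each ethovision row to its closest matlab loop.

```python
import pandas as pd

# Toy "matlab" data: one row per loop (the Time column is an assumed name)
matlab = pd.DataFrame({"Time": [0.0, 1.0, 2.5]})
# Duration between the current and the next loop; the last loop has no successor (NaN)
matlab["Duration"] = matlab["Time"].shift(-1) - matlab["Time"]

# Toy "ethovision" data, sampled more often than the matlab loops
etho = pd.DataFrame({
    "Time": [0.0, 0.4, 0.9, 1.2, 2.4, 2.6],
    "Velocity": [1.0, 3.0, 5.0, 7.0, 2.0, 4.0],
})

# Assign every ethovision row to the closest matlab loop (both frames sorted by Time)
loops = pd.DataFrame({"Time": matlab["Time"], "LoopTime": matlab["Time"]})
assigned = pd.merge_asof(etho, loops, on="Time", direction="nearest")

# Aggregate the ethovision rows of each loop (mean here) and join them back,
# so that the loop timing of the matlab data is preserved
etho_per_loop = assigned.groupby("LoopTime")["Velocity"].mean()
joined = matlab.join(etho_per_loop, on="Time")
print(joined)
```

In this sketch every column is aggregated with `mean()`; the real pipeline picks the method per column as described under autoDetectResampleMethods.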

* **The `Session Table` is then used to run the TableAnalysis**
    * First the configuration yaml file is loaded. Depending on the columns referenced in the config subsections ['TrialAggregation', 'PeriodAggregation', 'ExtendedTrialAggregation'], an empty result table is generated and later filled. Columns referenced in those sections but not present in the `Session Table` are filled with NaN values.
    * As a special fix, columns referenced in the config file under the section [`Multiply`] are multiplied by the column specified as key (e.g. `Mobility: Duration` multiplies the column Mobility by the column Duration and stores the result in Duration).
    * **PeriodTable**: Data is aggregated by [MouseID, TrialCnt, PeriodType] following the config settings in [`TrialAggregation`].
    * **TrialTable**: A second aggregation (on the original `Session Table`) is performed by [MouseID, TrialCnt]. In addition, every column under the config settings in [`PeriodAggregation`] is split into multiple columns, one for each extended trial type. The same aggregation function as for 'TrialAggregation' is used. One additional column, 'Precue_ResponseRate', is added: the average number of NosePoke onsets, calculated as `Precue_NosePoke_OnSet / Precue_Duration`.
    * **ExtendedTable**: Finally, 3 `extended Tables` are generated that contain the ResponseRate grouped by ['MouseID', 'ExtendedTrialType'], ['MouseID', 'TrialType'], and ['MouseID']. The response rate is calculated by counting the number of extended trials and applying the following formulas: `Go_RR = Go_Correct / (Go_Correct + Go_Omission)`, `NoGo_RR = FalseAlarm / (FalseAlarm + NG_Omission + NG_Correct)`.
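
The response rate formulas above can be tried out on a toy table. The following is a minimal sketch, assuming one row per trial with a label column `ExtendedTrialType`; this table layout and the mouse IDs are made up for illustration, and the package may count trials differently.

```python
import pandas as pd

# Toy trial table: one row per trial (layout assumed for illustration)
trials = pd.DataFrame({
    "MouseID": ["A1C1"] * 6 + ["A1C2"] * 4,
    "ExtendedTrialType": [
        "Go_Correct", "Go_Correct", "Go_Omission",
        "FalseAlarm", "NG_Correct", "NG_Omission",
        "Go_Correct", "Go_Omission", "NG_Correct", "NG_Correct",
    ],
})

# Count the trials of each extended trial type per mouse (absent types count as 0)
counts = pd.crosstab(trials["MouseID"], trials["ExtendedTrialType"])
expected = ["Go_Correct", "Go_Omission", "FalseAlarm", "NG_Omission", "NG_Correct"]
counts = counts.reindex(columns=expected, fill_value=0)

# Apply the two formulas from the text
rates = pd.DataFrame(index=counts.index)
rates["Go_RR"] = counts["Go_Correct"] / (counts["Go_Correct"] + counts["Go_Omission"])
rates["NoGo_RR"] = counts["FalseAlarm"] / (
    counts["FalseAlarm"] + counts["NG_Omission"] + counts["NG_Correct"]
)
print(rates)
```

Note that a mouse without any Go (or NoGo) trials would produce a division by zero, which pandas reports as NaN.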

* **Storing the results**
    * An hdf file is generated containing the PeriodTable (as /T).
    * In addition, CSV files containing the tables are written (named PeriodTable.csv, TrialTable.csv, ExtentedTableByExtendedTrialType.csv, ExtTrialTableByTrialType.csv, ExtTrialTableByMouse.csv).
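
The period-level aggregation and the CSV export described in the steps above might look roughly like this. It is a sketch under assumptions: the `aggregation` dictionary is a hypothetical stand-in for an aggregation section of the yaml file, and the columns `NosePoke_OnSet` and `Velocity` as well as all values are invented for the example.

```python
import pandas as pd

# Toy session table (columns besides MouseID, TrialCnt, PeriodType, Duration are assumed)
session = pd.DataFrame({
    "MouseID": ["A1C1"] * 4,
    "TrialCnt": [1, 1, 1, 2],
    "PeriodType": ["Precue", "Precue", "Cue", "Precue"],
    "Duration": [0.5, 0.25, 1.0, 2.0],
    "NosePoke_OnSet": [1, 0, 1, 3],
})

# Hypothetical stand-in for a config section: column name -> aggregation method
aggregation = {"Duration": "sum", "NosePoke_OnSet": "sum", "Velocity": "mean"}

# Columns referenced in the config but missing from the data are filled with NaN
for column in aggregation:
    if column not in session.columns:
        session[column] = float("nan")

# Aggregate by [MouseID, TrialCnt, PeriodType]
period_table = (
    session.groupby(["MouseID", "TrialCnt", "PeriodType"]).agg(aggregation).reset_index()
)

# Without a path, to_csv returns the CSV text; pass a file name to write a file instead
csv_text = period_table.to_csv(index=False)
print(csv_text)
```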



#Implementation Details
The following functions are explained in more detail:

- **autoDetectResampleMethods**: Based on the content of a column, an aggregation function is chosen. The default function is `mean()`; if the number
of unique values is <= 2, the `max()` function is used; and if 2 < number of unique values < 10, the most common value is used.
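
The rule above can be written down in a few lines. This is a sketch of the described rule, not the actual source of `autoDetectResampleMethods`: the helper name `detect_resample_method`, the method labels and the example columns are all hypothetical.

```python
import pandas as pd

def detect_resample_method(column: pd.Series) -> str:
    """Return a label for the aggregation method, based on the number of unique values."""
    n_unique = column.nunique()  # NaN values are not counted
    if n_unique <= 2:
        return "max"
    if n_unique < 10:
        return "most_common"
    return "mean"

# Label -> function that reduces a column to a single value
AGGREGATORS = {
    "max": lambda values: values.max(),
    "most_common": lambda values: values.mode().iloc[0],  # ties: smallest value wins
    "mean": lambda values: values.mean(),
}

table = pd.DataFrame({
    "InZone": [0, 1, 0, 1] * 3,                 # 2 unique values
    "Zone": [1, 2, 3, 3] * 3,                   # 3 unique values
    "Velocity": [float(i) for i in range(12)],  # 12 unique values
})

methods = {name: detect_resample_method(table[name]) for name in table.columns}
aggregated = {name: AGGREGATORS[methods[name]](table[name]) for name in table.columns}
print(methods)
print(aggregated)
```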

#Example
Run something like:

    >>> import IO
    >>> E = IO.CageViewer('test/Exp08/P04_S07_base_amph/160113_P04_S07_A1C1.mat', parseFileName=r'(?P<Phase>[A-Z\d]+)_(?P<Session>[A-Z\d]+)_(?P<MouseID>[A-Z\d]+)\.mat')
    >>> E.data
    #Show Table
    >>> E.meta
    #Shows meta data
    >>> E2 = E.resample('100ms')
    #E2 is the resampled data

    >>> R = IO.EthoVision('test/Exp08/P04_S07_base_amph/Raw_data-Exp_8_P04S07_Box1and2-Trial_1.csv', sep=',')
    >>> R.data
    #Show data
    >>> R.meta
    #Show meta data

    >>> R2 = R.resample('100ms')

    #Load a complete Session. This takes a very long time ...
    >>> T = IO.loadSession('test/Exp08/P04_S07_base_amph')

    #or load a previously stored table
    #Store
    >>> T.to_hdf('T.hdf', key='T')
    
    #Load
    >>> import pandas as pd
    >>> store = pd.HDFStore('T.hdf')
    >>> T = store['T']

    >>> T
    #Show session data
    >>> T.query('MouseID == "A1C1"')
    #Show data about mouse A1C1

    >>> import TrialAnalysis
    >>> pT, tT, rrT = TrialAnalysis.TrialAnalysis(T, 'TrialAnalysis.yaml')
    #pT ... Period Table
    #tT ... Trial Table
    #rrT ... ResponseRate Table

    #Store result to file
    >>> tT.to_excel('TrialTable.xlsx')
    >>> tT.to_csv('TrialTable.csv')

